ArchiveBot: The Archive Team Crowdsourced Crawler

ArchiveBot is an IRC bot designed to automate the archival of smaller websites (e.g. up to a few hundred thousand URLs). You give it a URL to start at, and it grabs all content under that URL, records it in a WARC, and then uploads that WARC to ArchiveTeam servers for eventual injection into the Internet Archive (or other archive sites).
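
As a rough illustration of what "all content under that URL" could mean, the sketch below checks whether a discovered link falls under a starting URL. It is only an assumed interpretation (same host, path prefix match); ArchiveBot's real scoping rules may well differ, and the function name `in_scope` is hypothetical, not part of ArchiveBot.

```python
from urllib.parse import urlsplit


def in_scope(start_url: str, candidate: str) -> bool:
    """Hypothetical scope check: is `candidate` under `start_url`?

    Assumption: "under" means same host and a path that begins with the
    directory portion of the starting URL's path.
    """
    start = urlsplit(start_url)
    cand = urlsplit(candidate)

    # Only web links are considered; mailto:, ftp:, etc. are out of scope.
    if cand.scheme not in ("http", "https"):
        return False

    # Host names are case-insensitive, so compare them lowercased.
    if (cand.hostname or "").lower() != (start.hostname or "").lower():
        return False

    # Reduce the start path to its directory, e.g. /blog/index.html -> /blog/
    if start.path.endswith("/"):
        prefix = start.path
    else:
        prefix = start.path.rsplit("/", 1)[0] + "/"

    return (cand.path or "/").startswith(prefix)
```

With this sketch, a crawl started at `http://example.com/blog/` would follow `http://example.com/blog/post-1` but skip `http://example.com/about` and anything on another host.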

To use ArchiveBot, drop by #archivebot on EFNet. You interact with ArchiveBot by typing commands into the channel. Note that you will need channel operator permissions to start archiving jobs.

The ArchiveBot dashboard at http://www.archivebot.com shows the sites currently being downloaded.

ArchiveBot's source code can be found at https://github.com/ArchiveTeam/ArchiveBot.

PART OF
Archive Team
Web Crawls

TOPIC
foseti.wordpress.com 2
itunes.apple.com 2
twitter.com 2
184.180.244.41 1
3dblogger.typepad.com 1
ahkscript.org 1
approachingaro.org 1
arstechnica.com 1
ay-riders.speccy.cz 1
becominggaia.wordpress.com 1
bitcointalk.org 1
blakemasters.com 1
blog.42floors.com 1
blog.coinjar.io 1
blog.dispatch.cc 1
blog.do.com 1
blog.dopplr.com 1
blog.ioactive.com 1
blog.lazymeter.com 1
blog.muflax.com 1
blog.zapd.com 1
blogs.hbr.org 1
boringasheck.com 1
breadlabs.tumblr.com 1
buddhism-for-vampires.com 1
coinjar.io 1
commonsware.com 1
cursiveclojure.com 1
daily.muflax.com 1
darkmail.info 1
darkpatterns.org 1
developer.leapmotion.com 1
developers.seagate.com 1
do.com 1
fanart.lionking.org 1
files.chatnfiles.com 1
forsythia.net 1
forum.reverse4you.org 1
forums.eventscripts.com 1
forums.spiderbasic.com 1
forums.unrealengine.com 1
forums.vandyke.com 1
framebase.io 1
freeweibo.com 1
garyfung.ca 1
gospel.muflax.com 1
help.do.com 1
hintsforums.macworld.com 1
horseecomics.tumblr.com 1
ilovelilychouchou.com 1
irclog.perlgeek.de 1
isohunt.com 1
kennethfolkdharma.com 1
labs.enigma.io 1
ludios.org 1
malmen.org 1
math.eretrandre.org 1
mathematicalmulticore.wordpress.com 1
medium.com 1
mitchell.jp 1
mrlandsberry.weebly.com 1
msram.github.io 1
muflax.com 1
nakkaya.com 1
nathanic.org 1
ootbcomp.com 1
personal.inet.fi 1
plus.google.com 1
privacy.cryptoseal.com 1
sachachua.com 1
security.mongohq.com 1
sgvo.homestead.com 1
sharknet.us 1
sinocism.com 1
slatestarcodex.com 1
softwarethis.com 1
southforksecurity.com 1
springfieldfiles.com 1
storify.com 1
support.doctape.com 1
support.google.com 1
support.leapmotion.com 1
tekpub.com 1
thenewinquiry.com 1
theprofoundprogrammer.com 1
therafirmata.webklik.nl 1
thesmokinggun.com 1
thinkprogress.org 1
tug.org 1
upl-gravedigger.boo.jp 1
uproxy.org 1
us2.campaign-archive1.com 1
usemycomputer.com 1
utopia.duth.gr 1
venturebeat.com 1
vgoulet.act.ulaval.ca 1
vndb.org 1
warhammeronline.com 1
wekeroad.com 1
windowsmoviemakers.net 1
wristifyme.com 1
www.animenewsnetwork.com 1
www.cs.carleton.edu 1
www.devttys0.com 1
www.dopplr.com 1
www.eventbrite.com 1
www.facebook.com 1
www.fullscreenmario.com 1
www.getairo.com 1
www.gwern.net 1
www.healthcare.gov 1
www.interfluidity.com 1
www.kanjitomo.net 1
www.killers4hire.com 1
www.loper-os.org 1
www.magicthegatheringtactics.com 1
www.mcfallweb.com 1
www.microcodeconsulting.com 1
www.practicefusion.com 1
www.reddit.com 1
www.righto.com 1
www.segagagadomain.com 1
www.solarus-games.org 1
www.trunker.info 1
www.unitrunker.com 1
www.urbit.org 1
www.warhammeronline.com 1
www.washingtonpost.com 1
www.xoxohth.com 1
www.zapd.com 1
www.zhoutong.com 1
x.rubini.us 1
xkcdexplained.tumblr.com 1
zuckerbergfiles.org 1
ArchiveBot is an Archive Team service to quickly grab smaller at-risk or critical sites to bring copies into the Internet Archive Wayback machine.
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
39,017
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
34,730
0
0
Topics: 3dblogger.typepad.com, ahkscript.org, approachingaro.org, arstechnica.com,...
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
22,585
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
22,294
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
20,431
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
20,123
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
17,043
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
16,256
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
14,259
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
13,930
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
13,328
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
11,906
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
11,809
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
11,362
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
11,271
0
0
ArchiveBot: The Archive Team Crowdsourced Crawler
by Archive Team
11,210
0
0